🤖 AI Inference - nmarshall · Scour

Introducing dotLLM - Building an LLM Inference Engine in C# 🏗️AI Infrastructure

kokosa.dev·12h·Hacker News

Quantization, LoRA, and the 8% Problem: Benchmarking Local LLMs for Production AI 📱Edge AI

walsenburgtech.com·3d·Hacker News

amitshekhariitbhu/llm-internals: Learn LLM internals step by step - from tokenization to attention to inference optimization. 💻Local LLMs

github.com·1d·Hacker News

Show HN: Omega Walls – open-source stateful runtime defense for RAG and AI agents 🤖AI agents

synqra.tech·15h·Hacker News

Show HN: Dbg – One CLI debugger for every language (AI-agent ready) 🤖AI Coding Tools

redknightlois.github.io·1d·Hacker News

Redefining AI Inference With New Silicon Architecture ⚡Hardware Acceleration

semiengineering.com·5d

Your LLM is a compiler, not a runtime ⚙️LLVM

getpocketbot.com·1d·Hacker News

LLM inference engine written ground-up natively in C#/.NET 🏗️AI Infrastructure

dotllm.dev·11h·Hacker News

Compare TEE-Based AI Providers 🏗️AI Infrastructure

confidentialinference.net·6d·Hacker News

Neural Computers 🧠Neuromorphic Hardware

arxiv.org·6d·Hacker News

Semidynamics Secures SK hynix Investment to Advance Memory-Centric AI Inference Architecture 🏗️AI Infrastructure

hpcwire.com·4d·Hacker News

Guardrails at the gateway: Securing AI inference on GKE with Model Armor 🏠Self-hosted AI

cloud.google.com·5d

LLM inference, optimized for your Mac 💻Local LLMs

omlx.ai·4d·Hacker News

Predict-Rlm: The LLM Runtime That Lets Models Write Their Own Control Flow 🏠Self-hosted AI

repo-explainer.com·3d·Hacker News

Inside the Token Factory: A First-Principles Comparison of vLLM and SGLang ⚙️LLVM

hxu296.github.io·3d·Hacker News

I Ran My KYB Engine at Three Quantization Levels. Accuracy Didn't Move. Cost Dropped 6x. 📱Edge AI

walsenburgtech.com·5d·Hacker News

Fast Isn’t Fast Enough: Redefining Metrics for Edge AI 📱Edge AI

semiengineering.com·5d

AsyncTLS: Efficient Generative LLM Inference with Asynchronous Two-level Sparse Attention 💻Local LLMs

arxiv.org·5d

An Engineering Roadmap Toward Completely Neural Computers (Meta AI, KAUST) 🧠Neuromorphic Chips

semiengineering.com·3d

Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering 🤖Reinforcement Learning

arxiv.org·5d·Hacker News
